Factoring Adjunction in Hierarchical Phrase-Based SMT

Authors

  • Sophie Arnoult
  • Khalil Sima’an
Abstract

While much work has been done to inform Hierarchical Phrase-Based SMT (Chiang, 2005) models linguistically, the adjunct/argument distinction has generally not been exploited for these models. But as Shieber (2007) points out, capturing this distinction makes it possible to abstract over ‘intervening’ adjuncts, and is thus relevant for (machine) translation in general. We contribute an adjunction-driven approach to hierarchical phrase-based modelling that uses source-side adjuncts to relax extraction constraints, which allows long-distance dependencies to be captured, and to guide translation through labelling. The labelling scheme can be reduced to two adjunct/non-adjunct labels, and improves translation over Hiero by up to 0.6 BLEU points for English-Chinese.

Similar resources

Automated Grammar Correction Using Hierarchical Phrase-Based Statistical Machine Translation

We introduce a novel technique that uses hierarchical phrase-based statistical machine translation (SMT) for grammar correction. SMT systems provide a uniform platform for any sequence transformation task. Thus grammar correction can be considered a translation problem from incorrect text to correct text. Over the years, grammar correction data in the electronic form (i.e., parallel corpora of ...

Offline Extraction of Overlapping Phrases for Hierarchical Phrase-Based Translation

Standard SMT decoders operate by translating disjoint spans of input words, thus discarding information in the form of overlapping phrases that is present at phrase extraction time. The use of overlapping phrases in translation may enhance fluency in positions that would otherwise be phrase boundaries, they may provide additional statistical support for long and rare phrases, and they may generate ...

The RWTH Aachen German-English Machine Translation System for WMT 2014

This paper describes the statistical machine translation (SMT) systems developed at RWTH Aachen University for the German→English translation task of the ACL 2014 Eighth Workshop on Statistical Machine Translation (WMT 2014). Both hierarchical and phrase-based SMT systems are applied employing hierarchical phrase reordering and word class language models. For the phrase-based system, we run dis...

Hierarchical Phrase-Based Statistical Machine Translation System

The aim of this thesis is to express the fundamentals and concepts behind one of the emerging techniques in statistical machine translation (SMT), hierarchical phrase-based MT, by implementing translation from Hindi to English. Basically, the hierarchical model extends phrase-based models by considering subphrases with the aid of a context-free grammar (CFG). In other models, syntax based models bear a rese...

Shallow Semantic Trees for SMT

We present a translation model enriched with shallow syntactic and semantic information about the source language. Base-phrase labels and semantic role labels are incorporated into a hierarchical model by creating shallow semantic “trees”. Results show an increase in performance of up to 6% in BLEU scores for English-Spanish translation over a standard phrase-based SMT baseline.

Publication date: 2016